Unit selection in concatenative TTS synthesis systems based on mel filter bank amplitudes and phonetic context
نویسندگان
چکیده
In concatenative text-to-speech (TTS) synthesis systems unit selection aims to reduce the number of concatenation points in the synthesized speech and make concatenation joins as smooth as possible. This research considers synthesis of completely new utterances from non-uniform units, whereby the most appropriate units, according to acoustic and phonetic criteria, are selected from a myriad of similar speech database candidates. A Viterbi-style algorithm dynamically selects the most suitable database units from a large speech database by considering concatenation and target costs. Concatenation costs are derived from mel filter bank amplitudes, whereas target costs are considered in terms of the phonemic and phonetic properties of required units. Within subjects and between subjects ANOVA [9] evaluation of listeners’ scores showed that the TTS system with this method of unit selection was preferred in 52% of test sentences.
منابع مشابه
ACTOR: A multilingual unit-selection speech synthesis system
The ACTOR® Text-To-Speech (TTS) synthesis system, developed at Loquendo S.p.A., is here described. The system employs a unit -selection concatenative synthesis technique, relying on labeled acoustic databases providing phonetic and prosodic coverage of the intended language/domain and on an original algorithm for run-time selection of the acoustic units to be concatenated. This technique yields...
متن کاملAdvances in Spectral Parameterization for Statistical (HMM-Based) TTS
HMM-based parametric speech synthesis has recently become an alternative to the concatenative TTS approach, especially when low footprint and general speech domain are required. A majority of speech parameterization models used in state-ofthe art HMM TTS systems employ source-filter waveform synthesis schemes. Sinusoidal representation and waveform generation of speech is an alternative to the ...
متن کاملAn embedded and concatenative approach to TTS of multiple languages
In thi and appro for E (Es.), efficie archit embe can b comm select text p the la are u letterspeec etc., a This paper presents an embedded and concatenative approach to multilingual text-to-speech system (ECMTTS). Under a uniform architecture, the TTS modules are separated into language dependent and independent ones. A specifically defined super phonetic symbol set enables to use uniform spee...
متن کاملA Corpus-Based Concatenative Speech Synthesis System for Turkish
Speech synthesis is the process of converting written text into machine-generated synthetic speech. Concatenative speech synthesis systems form utterances by concatenating pre-recorded speech units. Corpus-based methods use a large inventory to select the units to be concatenated. In this paper, we design and develop an intelligible and natural sounding corpus-based concatenative speech synthes...
متن کاملData pruning approach to unit selection for inventory generation of concatenative embeddable Chinese TTS systems
In this paper, a data pruning approach is presented for building acoustic unit inventory for syllable-based concatenative embeddable Chinese TTS system. A 3-portion segmentation of a syllable is proposed based on the nature of voiced/unvoiced structure of Chinese syllable. Individual factorial acoustic measurement of syllable is used to calculate the penalty of perceptual unsatisfactory for con...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2003